Abstract. In this paper we study the question of whether or not a static
search tree should ever be unbalanced. We present several methods to
restructure an unbalanced k-ary search tree T into a new tree R that
preserves many of the properties of T while having a height of
$\log_k n + 1$, which is one unit off of the optimal height. More
specifically, we show that it is possible to ensure that the depth of the
elements in R is no more than their depth in T plus at most
$\log_k \log_k n + 2$. At the same time it is possible to guarantee that the
average access time P(R) in tree R is no more than the average access time
P(T) in tree T plus $O(\log_k P(T))$. This suggests that for most
applications, a balanced tree is always a better option than an unbalanced
one since the balanced tree has similar average access time and much better
worst case access time.



1 Introduction 

The dictionary problem is fundamental in computer science: it asks for a data
structure that efficiently stores and retrieves data. Binary search trees are simple,
powerful and commonly used dictionaries. The problem of building static search 
trees has been intensively studied in the past decades. Depending on the perfor- 
mance required one can build a perfectly balanced search tree that guarantees 
an optimal worst-case search time or one can build a biased search tree match- 
ing the entropy bound thereby providing an optimal expected search time. The 
search tree that minimizes the expected search cost can be unbalanced thereby 
behaving badly in the worst-case. Thus one may prefer to build a search tree 
of bounded height, i.e., with a certain guarantee on the worst-case search time 
that also minimizes the expected search time. In this paper we address the issue 
of the increase in the expected search cost imposed by restricting the height of 
the constructed tree. 

Since a search tree T minimizing the expected search cost may behave badly 
in worst-case, one may want to construct another tree R on the same set of 
keys in such a way that the worst-case search time is improved but the expected 
search time does not differ too much from the initial tree. One way to achieve 
this is to guarantee that R has bounded height and that the depth of a key in R 
is not much more than its depth in T. This is known as the restructuring search 
tree problem. Moreover, the problem of designing such search trees is directly
related to the design of good codes. Thus the results obtained in this paper on
search trees also have straightforward applications in coding theory.

Preliminaries Consider the set $x_1, x_2, \ldots, x_n$ of keys contained in a
search tree T. We are given 2n + 1 weights $p_1, p_2, \ldots, p_n$ and
$q_0, q_1, \ldots, q_n$ such that $\sum_{i=1}^{n} p_i + \sum_{i=0}^{n} q_i = 1$.
Here, $p_i$ is the probability to query the key $x_i$ (successful search) and
$q_i$ is the probability to query a key lying between $x_i$ and $x_{i+1}$
(unsuccessful search); $q_0$ and $q_n$ are the probabilities to query a key
that is less or greater, respectively, than any key contained in the tree.

Static multiway search trees (or k-ary trees) generalize most of the other
static search tree structures. A successful search ends up in an internal node of
a k-ary tree that contains the requested key. Each internal node of a k-ary tree
contains at most $k - 1$ keys and has between 1 and k children. An unsuccessful
search ends up in one of the n + 1 leaves of the k-ary tree. A leaf in a k-ary tree
does not contain any key. The weighted path length of a k-ary tree T (referred to
as path length in the remainder of this paper), a measure of the average number
of nodes traversed during a search, is defined as

$$P(T) = \sum_{i=1}^{n} p_i \, (d_T(x_i) + 1) + \sum_{i=0}^{n} q_i \, d_T(x_{i-1}, x_i),$$

where $d_T(x_i)$ is the depth, in terms of number of links from the root node, of
the internal node containing the key $x_i$, and $d_T(x_{i-1}, x_i)$ is the depth
of the leaf reached at the end of the unsuccessful search for a key lying between
$x_{i-1}$ and $x_i$. In the context of binary search trees (when k = 2) in the
comparison-based model, the path length corresponds to the average number of
comparisons performed during a search. In the external memory model, the path
length corresponds to the average number of I/Os performed during a search in
the case where each node is stored as one disk block. Note that this is the
usual way to store a multiway search tree in external memory.
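
As a small illustration of the path length, the following sketch evaluates the sum of $p_i (d_T(x_i) + 1)$ over the keys plus the sum of $q_i d_T(x_{i-1}, x_i)$ over the leaves, directly from lists of depths. The function name `path_length`, the argument layout, and the example probabilities are assumptions made for this illustration only.

```python
def path_length(p, q, key_depth, leaf_depth):
    """Return sum_i p_i*(d(x_i)+1) + sum_i q_i*d(x_{i-1}, x_i)."""
    if len(p) != len(key_depth) or len(q) != len(leaf_depth):
        raise ValueError("one depth is needed per key and per leaf")
    if len(q) != len(p) + 1:
        raise ValueError("n keys require n + 1 leaves")
    successful = sum(pi * (d + 1) for pi, d in zip(p, key_depth))
    unsuccessful = sum(qi * d for qi, d in zip(q, leaf_depth))
    return successful + unsuccessful


# Hypothetical example: three keys in a perfectly balanced binary tree.
# x_2 is the root (depth 0), x_1 and x_3 have depth 1, the four leaves depth 2.
p = [0.2, 0.3, 0.1]
q = [0.1, 0.1, 0.1, 0.1]
balanced = path_length(p, q, key_depth=[1, 0, 1], leaf_depth=[2, 2, 2, 2])

# Same keys arranged as a right-going path with x_1 at the root.
path = path_length(p, q, key_depth=[0, 1, 2], leaf_depth=[1, 2, 3, 3])
print(balanced, path)
```

For this particular distribution the balanced arrangement has the smaller path length; a distribution that puts most of its mass on $x_1$ would favor the path instead.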

1.1 Related work 

Optimal search trees Knuth [12] showed that an optimal binary search tree
can be built in $O(n^2)$ time using $O(n^2)$ space. Mehlhorn [15] gave an $O(n)$
time algorithm to build a binary search tree that is near-optimal. Concerning the
more general case of k-ary trees, Vaishnavi et al. [17] showed that an optimal
k-ary tree can be built in $O(kn^3)$ time. Becker [2] gave an $O(kn^{\alpha})$
time algorithm, with $\alpha = 2 + \log_k 2$, to build an optimal B-tree (a
subclass of k-ary trees) that satisfies the original constraints fixed by Bayer
and McCreight [1]. These constraints require that every leaf in the B-tree has
the same depth and that every internal node contains between k/2 and k keys
except for the root node. In the remainder of this paper, we consider a more
general model of k-ary tree. The only constraint is that an internal node
contains at most $k - 1$ keys. Recently Bose and Douieb [3] presented a method
to build a k-ary tree in $O(n)$ time (independent of k) that gives the best
upper bound on the path length of a k-ary tree and produces a near-optimal
k-ary tree for any k > 2.

The problem of building an optimal search tree when only unsuccessful
searches occur, i.e., when $\sum_{i=1}^{n} p_i = 0$, is known as the optimal
alphabetic search tree problem. Hu and Tucker [8] developed an $O(n^2)$ time and
$O(n)$ space algorithm for constructing an optimal alphabetic binary search
tree. This was improved by two other algorithms, the first by Knuth [11] and the
second by Garsia and Wachs [7]. Both algorithms use $O(n \log n)$ time and
$O(n)$ space.

Optimal search trees with restricted height The problem of building an
optimal binary search tree with restricted maximal height has been addressed
by Garey [6]. The best algorithms solving this problem have been independently
developed by Wessner [18] and Itai [9]. They both produce the optimal binary
search tree, with h as the height restriction, in $O(hn^2)$ time. For the problem
of building an optimal alphabetic binary search tree with restricted maximal
height h, Larmore and Przytycka [14] presented an $O(hn \log n)$ time algorithm.

Restructuring search trees The problem of restructuring a search tree T
consists of building another tree R, on the same set of keys, with restricted
height such that the path length of R is as close as possible to the path length
of T. The drop of a node x is defined as $\Delta(x) = d_R(x) - d_T(x)$. This
problem was initially posed by Bose. Evans and Kirkpatrick [4] developed a
technique to restructure a binary search tree T into a tree R of height
$\lceil \log n \rceil + 1$ such that $\Delta(x) \le \log \log n$ for every node x
in T. They also showed that restructuring an alphabetic binary search tree can
be done with the guarantee that $\Delta(x) \le 2$ for every node x. Their work
mainly focused on understanding the tradeoff between the height restriction of
the restructured tree and the worst-case drop realized by a node. Gagie [5] gave
an alternate way to restructure a binary search tree into a tree of height
$\log n + 1$ that guarantees a slightly larger worst-case drop but aims at
reducing the total drop as opposed to the worst-case individual drop. He
provided an algorithm where the path length of the restructured tree R satisfies
$P(R) \le P(T) + (1 + \epsilon) \log(P(T) + 1) + \log((1/\epsilon) + 1) + 2$ with
$1 < \epsilon < 2$.
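
To make the notion of drop concrete, the sketch below computes $\Delta(x) = d_R(x) - d_T(x)$ for every key when T is a path and R is obtained by a plain median split that ignores T. The median split is only a naive baseline chosen for this illustration; it is not one of the restructuring methods discussed in this paper, and the helper names are hypothetical.

```python
def balanced_depths(n):
    """Depth of each of n sorted keys in a median-split binary search tree."""
    depth = [0] * n

    def build(lo, hi, d):
        if lo > hi:
            return
        mid = (lo + hi) // 2
        depth[mid] = d
        build(lo, mid - 1, d + 1)
        build(mid + 1, hi, d + 1)

    build(0, n - 1, 0)
    return depth


def drops(depth_T, depth_R):
    """Drop of every key: depth in R minus depth in T."""
    return [dr - dt for dt, dr in zip(depth_T, depth_R)]


n = 7
path_T = list(range(n))          # T is a path: key i sits at depth i
naive_R = balanced_depths(n)     # R ignores T and splits at the median
delta = drops(path_T, naive_R)
print(delta, max(delta))
```

In this baseline the key that was at the root of T lands at the bottom of R, so its drop equals the height of R; the methods surveyed above are designed to avoid exactly this behavior.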

1.2 Our results 

We present several methods to restructure a binary search tree that improve
the previous best upper bounds on both the local drop of an individual node and
the total drop of all nodes. The methods and the proofs are all based on
a simple but general technique. We show that our method generalizes, and we are
the first to study how to restructure multiway search trees (previous work only
considers binary search trees). Our results are then used to prove new tighter
upper bounds on the path length of optimal height-restricted multiway search
trees.

In Section 2.2, we present new tree restructuring methods that focus on
reducing the worst-case drop of any given key. We first focus our attention on
restructuring a given alphabetic k-ary search tree into another one of height
$\log_k n + 1$ such that at least a quarter of the leaves do not drop at all, the
maximum drop realized by all but one of the leaves is at most 1 and exactly
one leaf drops at most 2 levels. Second, we present a restructuring method for
the general case of k-ary search trees that builds another k-ary tree on the same
keys with a guaranteed worst-case drop of at most $\log_k \log_k n$. In fact,
this method potentially gives a better bound since it takes into consideration
the balance of the initial tree. The more unbalanced the initial tree, the better
the guarantee on the drop. For example, if the initial tree is a path, then this
method guarantees that the worst-case drop is at most 1.

In Section 2.3, we develop a method focused on the relative drop. By this, we
mean that in the worst case, the amount that a node will drop is proportional to
its depth in the original tree as opposed to being proportional to the number of
nodes in the tree. For a given node $x_i$, the maximum drop is at most
$\log_k(d_T(x_i) + 1) + (1 + \epsilon) \log_k \log(d_T(x_i) + 2) + \log_k \frac{1+\epsilon}{\epsilon} + 1$.
As a consequence of this, the path length of the restructured tree is close to
the path length of the initial tree but the restructured tree has height at most
$\log_k n + 1$. In Section 2.4 we combine the worst-case and relative drop
approaches to obtain a hybrid method that guarantees simultaneously the best
upper bounds in terms of relative and worst-case drop plus a small constant.

Finally we show in Section 3 how the results on relative node drop can be used 
to obtain tighter upper bounds on the path length of optimal height-restricted 
multiway search trees. 

2 Restructuring multiway search trees 

Restructuring a search tree T consists of building a new tree R, on the same set 
of keys, such that R satisfies a precise constraint on its height. The problem is 
to determine how the tree R differs from T and how it is efficiently constructed. 
The main idea of our approach, similar to [5], is to define a weight distribution 
on the keys based on their depth in the initial tree T. The weights of the keys 
are defined differently depending on what kind of guarantee on the drop we want 
to achieve. We distinguish between two types of guarantees on the drop: local or 
global. A local guarantee specifies the maximum drop realized by any node. A 
global guarantee specifies the maximum increase of the path length. Given these 
newly defined weights, we build a near-optimal search tree using a technique 
described in the next section. 

2.1 Method to construct near-optimal multiway search trees

We describe a technique to build near-optimal multiway search trees, developed
by Bose and Douieb [3] and initially inspired by Mehlhorn's technique [15],
when access probabilities are known. This technique guarantees the best
theoretical upper bound on the path length of optimal multiway search trees. Note
that any other technique to build search trees can be used for the purpose of
restructuring trees but we use [3] because it guarantees the best properties.

Let $p_1, p_2, \ldots, p_n$ be the access probabilities of the internal nodes and
$q_0, q_1, \ldots, q_n$ be the access probabilities of the leaves. Let T' be the
tree built with the method of [3]. The following two lemmas characterize the
depth of the elements in T'; we distinguish the case where T' has a branching
number equal to 2 from the case where it is greater. We define the value
$m = \max\{n - 3P, P\} - 1 \ge \frac{n}{4} - 1$, where P is the number of
increasing or decreasing sequences in the access probability distribution on
the ordered leaves. The value $q_{\mathrm{rank}[i]}$ is the ith smallest access
probability among the leaves except for the extremal ones (i.e., we exclude
$(-\infty, x_1)$ and $(x_n, \infty)$ from consideration).

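Lemma 1. The depth of the elements in T' satisfies the following:

$$d_{T'}(x_i) \le \left\lfloor \log_k \frac{1}{p_i + q_{\min}} \right\rfloor \quad \text{for } i = 1, \ldots, n,$$

$$d_{T'}(x_{i-1}, x_i) \le \left\lfloor \log_k \frac{2}{q_i} \right\rfloor + 1 \quad \text{for } i = 0, \ldots, n.$$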

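The following lemma is not explicitly described in [3]; additional details will
appear in the journal version of this paper.

Lemma 2. In the case where k = 2, the depth of the elements in T' satisfies the
following:

$$d_{T'}(x_i) \le \left\lfloor \log_2 \frac{1}{p_i + q_{\min}} \right\rfloor \quad \text{for } i = 1, \ldots, n,$$

$$d_{T'}(x_{i-1}, x_i) \le \left\lfloor \log_2 \frac{1}{q_i} \right\rfloor + 2 \quad \text{for one leaf } (x_{i-1}, x_i),$$

$$d_{T'}(x_{j-1}, x_j) \le \left\lfloor \log_2 \frac{1}{q_j} \right\rfloor + 1 \quad \text{for all leaves } (x_{j-1}, x_j) \ne (x_{i-1}, x_i),$$

$$d_{T'}(x_{j-1}, x_j) \le \left\lfloor \log_2 \frac{1}{q_j} \right\rfloor \quad \text{for at least } m + 2 \text{ leaves } (x_{j-1}, x_j).$$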

Theorem 1 The path length of the tree T' is at most

$$UB(k) = \frac{H}{\log_2 k} + 1 + \sum_{i=0}^{n} q_i - q_0 - q_n - \sum_{i=0}^{m} q_{\mathrm{rank}[i]},$$

where $H = \sum_{i=1}^{n} p_i \log_2(1/p_i) + \sum_{i=0}^{n} q_i \log_2(1/q_i)$ is
the entropy of the probability distribution. In the case k = 2 the path length
of T' is at most

$$UB(2) = H + 1 - q_0 - q_n + q_{\max} - \sum_{i=0}^{m'} pq_{\mathrm{rank}[i]},$$

where the value $m' = \max\{2n - 3P, P\} - 1 \ge \frac{n}{2} - 1$,
$pq_{\mathrm{rank}[i]}$ is the ith smallest access probability among every key
and every leaf (except the extremal leaves) and $q_{\max}$ is the greatest leaf
probability including extremal leaves.
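
The sketch below evaluates the entropy H and the bound UB(k), read here as $H/\log_2 k + 1 + \sum_i q_i - q_0 - q_n$ minus the sum of the smallest inner leaf probabilities $q_{\mathrm{rank}[0]}, \ldots, q_{\mathrm{rank}[m]}$. Two points are assumptions of this illustration: the value m is passed in by the caller rather than derived from P, and the rank sum is taken to contain the m + 1 smallest inner leaf probabilities. The function names are hypothetical.

```python
import math


def entropy(p, q):
    """H = sum p_i log2(1/p_i) + sum q_i log2(1/q_i), skipping zero terms."""
    return sum(x * math.log2(1 / x) for x in list(p) + list(q) if x > 0)


def upper_bound_k(p, q, k, m):
    """UB(k) for k > 2, with m supplied by the caller (assumed reading)."""
    inner = sorted(q[1:-1])          # leaf probabilities without q_0 and q_n
    smallest = sum(inner[: m + 1])   # terms i = 0..m of q_rank[i]
    return (entropy(p, q) / math.log2(k) + 1
            + sum(q) - q[0] - q[-1] - smallest)


# Hypothetical one-key instance: p_1 = 1/2, q_0 = q_1 = 1/4.
p = [0.5]
q = [0.25, 0.25]
H = entropy(p, q)
ub = upper_bound_k(p, q, k=4, m=0)
print(H, ub)
```

With only extremal leaves present the rank sum is empty, so the bound reduces to $H/\log_2 k + 1$ in this tiny instance.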



2.2 Worst case drop 



In this section we consider the problem of minimizing the maximum drop inde- 
pendently realized by each node. 



Alphabetical tree 

An alphabetic search tree is a tree where only unsuccessful searches occur, i.e.,
when $\sum_{i=1}^{n} p_i = 0$. In order to restructure an alphabetic tree T, we
first define a weight for each leaf in T based on its depth in T. Namely the
weight of a leaf node $(x_{i-1}, x_i)$ is defined as

, . / 1 1 

w(Xi-i,Xi) — max 



Let $W = \sum_{i=0}^{n} w(x_{i-1}, x_i)$, which is always strictly smaller than
$1 + \frac{1}{k-1} = \frac{k}{k-1}$ by Kraft's inequality [13]. These weights are
used to define the access probabilities of each leaf. The access probability of
a leaf $(x_{i-1}, x_i)$ is defined as $q_i = w(x_{i-1}, x_i)/W$ and the access
probability of an internal node $x_i$ as $p_i = 0$. These probabilities are then
used as input to the algorithm described in Section 2.1 to build a near-optimal
binary search tree giving the restructured tree R on the same keys.
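
Kraft's inequality, invoked above, states that the leaf depths of any tree in which every node has at most k children satisfy $\sum k^{-d} \le 1$. A minimal sketch that checks this with exact rational arithmetic follows; the function name `kraft_sum` and the two example trees are chosen for this illustration.

```python
from fractions import Fraction


def kraft_sum(leaf_depths, k):
    """Exact value of the sum over leaves of k**(-depth)."""
    return sum(Fraction(1, k ** d) for d in leaf_depths)


# Binary path on three keys: leaves at depths 1, 2, 3, 3.
binary_path = kraft_sum([1, 2, 3, 3], 2)

# Ternary tree whose root has two leaf children and one internal child
# that itself has two leaf children: leaves at depths 1, 1, 2, 2.
ternary = kraft_sum([1, 1, 2, 2], 3)

print(binary_path, ternary)
assert binary_path <= 1 and ternary <= 1
```

The sum equals 1 exactly when every internal node has all k children, as in the binary path, and is strictly smaller otherwise, as in the ternary example.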

Theorem 2 An alphabetic multiway tree T can be restructured into a tree R
such that the height of R is at most $\log_k n + 1$ and the maximum drop of a
leaf is at most 1 if k > 2. When k = 2 a drop of 2 is realized by only one leaf,
the drop of any other is at most 1. In general, at least $m + 2$ leaves do not
drop, with $m \ge \frac{n}{4} - 1$ as defined in Section 2.1.

Proof. By Lemma 1, the greatest depth reached by an internal node is at most
$\log_k n + 1$. As a consequence the greatest depth of a leaf is at most
$\log_k n + 1$, which corresponds to the maximum height of the restructured tree.

The depth of a leaf $(x_{i-1}, x_i)$ in the restructured tree R is at most

$$\left\lfloor \log_k \frac{2}{q_i} \right\rfloor + 1 \le \left\lfloor \log_k \frac{2 k^{d_T(x_{i-1}, x_i) + 1}}{k - 1} \right\rfloor + 1 = \left\lfloor \log_k \frac{2}{k - 1} \right\rfloor + d_T(x_{i-1}, x_i) + 2.$$

Thus for k > 2, the depth of a leaf $(x_{i-1}, x_i)$ is at most
$d_T(x_{i-1}, x_i) + 1$, which implies a maximum leaf drop of 1. Using Lemma 2,
similar arguments verify the theorem in the case where k = 2. □

This simple method thus generalizes the result of Evans and Kirkpatrick [4] to
k-ary alphabetic search trees. It also gives a more precise guarantee on the
maximal drop of a leaf in the binary alphabetic search tree case, since we
guarantee that only one leaf drops two levels and all other leaves drop at most
1 level, with a quarter of the leaves not dropping at all. Note that for some
binary search trees any restructuring method produces a drop of 2 (see [4]).



General k-ary search tree 

Here the weight of an internal node $x_i$ is defined as follows

$$w(x_i) = \max\left\{ \frac{1}{k^{d_T(x_i)}}, \frac{W'}{(k-1)n} \right\},$$

where $W' = \sum_{i=1}^{n} \frac{1}{k^{d_T(x_i)}} \le (k-1) \log_k n$ by the
generalization of Kraft's inequality [16]. Let
$W = \sum_{i=1}^{n} w(x_i) < W' + \frac{W'}{k-1} = \frac{k}{k-1} W' \le k \log_k n$.
These weights are used to construct a probability distribution on the nodes. The
access probability of an internal node $x_i$ is $p_i = w(x_i)/W$ whereas the
access probability of a leaf is null, i.e., $q_i = 0$ for all leaves. These
probabilities are used to build the restructured tree R with the technique
described in Section 2.1.
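
The following sketch shows how such weights turn into depth guarantees. It assumes the weight of a key at depth d is $\max\{k^{-d}, W'/((k-1)n)\}$ with $W' = \sum_i k^{-d_T(x_i)}$ (this reading of the definition is an assumption of the example), sets $p_i = w(x_i)/W$, and evaluates the Lemma 1 bound $\lfloor \log_k (1/p_i) \rfloor$ for each key. It does not build the tree R itself, and the helper names are hypothetical.

```python
import math


def worst_case_weights(key_depths, k):
    """Weights max(k**-d, W'/((k-1)n)) and their total W (assumed formula)."""
    n = len(key_depths)
    w_prime = sum(k ** -d for d in key_depths)
    floor_weight = w_prime / ((k - 1) * n)
    w = [max(k ** -d, floor_weight) for d in key_depths]
    return w, sum(w)


def depth_bounds(key_depths, k):
    """Lemma 1 bound floor(log_k(1/p_i)) with p_i = w(x_i)/W and q_i = 0."""
    w, total = worst_case_weights(key_depths, k)
    return [math.floor(math.log(total / wi, k)) for wi in w]


path = list(range(7))                 # binary path: key i sits at depth i
bounds = depth_bounds(path, 2)
worst_drop = max(b - d for b, d in zip(bounds, path))
print(bounds, worst_drop)
```

On this seven-key path the guaranteed depths never exceed 3 and no key is guaranteed to drop by more than one level, in line with the remark that highly unbalanced inputs receive the strongest guarantee.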

Theorem 3 A multiway search tree T can be restructured into a tree R such
that the height of R is at most $\log_k n + 1$ and the maximum drop of a node is
at most $\log_k \log_k n$.



Proof. By Lemma 1, the depth of an internal node $x_i$ is at most
$\lfloor \log_k \frac{1}{p_i} \rfloor = \lfloor \log_k \frac{W}{w(x_i)} \rfloor$.
The greatest depth reached by an internal node is

$$\max_i \log_k \frac{W}{w(x_i)} \le \log_k \frac{\frac{k}{k-1} W'}{\frac{W'}{(k-1)n}} = \log_k n + 1.$$

As a consequence the greatest depth of a leaf is at most $\log_k n + 1$, which
corresponds to the maximum height of the restructured tree. The depth of an
internal node $x_i$ in the restructured tree R is at most

$$\left\lfloor \log_k \frac{W}{w(x_i)} \right\rfloor \le \left\lfloor \log_k \frac{k}{k-1} W' k^{d_T(x_i)} \right\rfloor = d_T(x_i) + \left\lfloor \log_k \frac{W'}{k-1} \right\rfloor + 1.$$

The maximum drop is at most $\log_k \log_k n$ for both internal nodes and leaves
since the drop of a leaf is the same as the drop of its parent (an internal node).

□

This method generalizes the result of Evans and Kirkpatrick [4] to k-ary search
trees. For the binary search tree case, the worst-case drop guaranteed with
this method is similar to the one given by Evans and Kirkpatrick. Indeed there
are some instances for which our method produces a drop of $\log_k \log_k n$. But
for most instances the guarantee is better since our method takes into
consideration the balance of the initial tree. For example, if the tree is a
list then the worst-case drop is constant. The value W' is the expression of the
balance of the initial tree: W' is $O(1)$ for a highly unbalanced tree and
$\Theta(\log n)$ when the tree is balanced.



2.3 Relative drop 

Generally a static unbalanced search tree is needed when frequently accessed
elements have to be accessed much faster than the other elements. In this
context, if we want to restructure an unbalanced tree in order to satisfy a
precise constraint on its height, it is important that elements located close to
the root in the original tree remain close to the root in the restructured tree.
To achieve this, we bound the maximum drop of an element with respect to its
depth in the original tree. This optimization differs from the previous one as
it aims to reduce the global instead of the local drop.

First we define the weight of an internal element

$$w(x_i) = \max\left\{ \frac{1}{D(x_i)\,(d_T(x_i) + 1) \log^{1+\epsilon}(d_T(x_i) + 2)}, \frac{1+\epsilon}{\epsilon n (k-1)} \right\}$$

with $1 < \epsilon < 2$, where $D(x_i)$ is the number of elements at depth
$d_T(x_i)$ in the tree T, thus $D(x_i) \le (k-1) k^{d_T(x_i)}$. Let
$W = \sum_{i=1}^{n} w(x_i)$, which is strictly smaller than
$\sum_{i=1}^{n} \frac{1}{i \log^{1+\epsilon}(i+1)} + \frac{n(1+\epsilon)}{\epsilon n (k-1)} < \frac{k(1+\epsilon)}{(k-1)\epsilon}$.
These weights define a probability distribution on the nodes so that the access
probability of an internal node $x_i$ is given by $p_i = w(x_i)/W$. We consider
the leaves to have an access probability of zero, i.e., $q_i = 0$ for all
leaves. These probabilities are used to build the restructured tree R with the
technique described in Section 2.1.

Theorem 4 Define
$f(y) = \log_k y + (1 + \epsilon) \log_k \log(y + 1) + \log_k \frac{1+\epsilon}{\epsilon} + 1$.
A multiway search tree T can be restructured into a tree R of height
$\log_k n + 1$ where the drop of an internal node $x_i$ is at most
$f(d_T(x_i) + 1)$ and the drop of a leaf $(x_{i-1}, x_i)$ is at most
$f(d_T(x_{i-1}, x_i)) - 1$.

Proof. According to Lemma 1, the depth of an internal node is at most
$\lfloor \log_k \frac{1}{p_i} \rfloor = \lfloor \log_k \frac{W}{w(x_i)} \rfloor$.
The greatest depth that an internal node can reach is

$$\max_i \log_k \frac{W}{w(x_i)} \le \log_k \left( \frac{k(1+\epsilon)\,\epsilon n (k-1)}{(k-1)\epsilon\,(1+\epsilon)} \right) = \log_k n + 1.$$

As a consequence the greatest depth of a leaf is at most $\log_k n + 1$, which
corresponds to the maximum height of the restructured tree.
The depth of an internal node $x_i$ in R is at most

$$\left\lfloor \log_k \frac{W}{w(x_i)} \right\rfloor \le \left\lfloor \log_k \frac{k(1+\epsilon)}{(k-1)\epsilon} D(x_i)\,(d_T(x_i) + 1) \log^{1+\epsilon}(d_T(x_i) + 2) \right\rfloor$$

$$\le d_T(x_i) + \log_k(d_T(x_i) + 1) + (1 + \epsilon) \log_k \log(d_T(x_i) + 2) + \log_k \frac{1+\epsilon}{\epsilon} + 1.$$

The maximum depth of a leaf in R is the same as the maximum depth of its
parent node in R. Thus the depth of a leaf $(x_{i-1}, x_i)$ is at most

$$d_T(x_{i-1}, x_i) - 1 + \log_k(d_T(x_{i-1}, x_i)) + (1 + \epsilon) \log_k \log(d_T(x_{i-1}, x_i) + 1) + \log_k \frac{1+\epsilon}{\epsilon} + 1.$$

□
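
To get a feeling for the size of the relative drop, the sketch below evaluates $f(y) = \log_k y + (1+\epsilon)\log_k \log(y+1) + \log_k\frac{1+\epsilon}{\epsilon} + 1$ at a few depths. The unsubscripted logarithm is taken to be base 2 here, and the values k = 2 and $\epsilon = 1.5$ are arbitrary choices; both are assumptions of this illustration.

```python
import math


def f(y, k, eps):
    """f(y) = log_k y + (1+eps) log_k log2(y+1) + log_k((1+eps)/eps) + 1."""
    return (math.log(y, k)
            + (1 + eps) * math.log(math.log2(y + 1), k)
            + math.log((1 + eps) / eps, k)
            + 1)


k, eps = 2, 1.5
for depth in (0, 1, 3, 7, 15):
    # Theorem 4 bounds the drop of an internal node at this depth by f(depth + 1).
    print(depth, round(f(depth + 1, k, eps), 3))
```

The bound grows only logarithmically with the original depth, so keys near the root of T are guaranteed to stay near the root of R regardless of n.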



Theorem 5 Define $m = \lfloor \log_k n \rfloor + 1$. A multiway search tree T can
be restructured into a tree R such that the height of R is at most h (with
h > m) and the depth of an internal node $x_i$ satisfies

$$d_R(x_i) \le d_T(x_i) + f(d_T(x_i) + 1 - h + m) \quad \text{if } h - m < d_T(x_i) < h,$$
$$d_R(x_i) \le d_T(x_i) \quad \text{otherwise.}$$

For a leaf $(x_{i-1}, x_i)$,

$$d_R(x_{i-1}, x_i) \le d_T(x_{i-1}, x_i) + f(d_T(x_{i-1}, x_i) - h + m) \quad \text{if } h - m < d_T(x_{i-1}, x_i) < h,$$
$$d_R(x_{i-1}, x_i) \le d_T(x_{i-1}, x_i) \quad \text{otherwise.}$$

Proof. Consider the subtrees of T rooted at the elements at depth $h - m$. Apply
the restructuring procedure described in the beginning of this section to each of
those subtrees seen as independent trees. This restructuring does not affect the
depth of elements at depth strictly smaller than $h - m$. According to Theorem 4,
the maximal drop of the other internal nodes $x_i$ is proportional to the depth
inside the subtree that contains them, i.e.,
$d_R(x_i) \le d_T(x_i) + f(d_T(x_i) + 1 - (h - m))$.
The maximum drop of a leaf $(x_{i-1}, x_i)$ is at most the maximum drop of its
parent node, i.e.,
$d_R(x_{i-1}, x_i) \le d_T(x_{i-1}, x_i) + f(d_T(x_{i-1}, x_i) - (h - m))$. □

We show how to restructure a tree T into a tree R with nearly minimum
height such that the increase of the path length is small. This new tree
restructuring method slightly improves the result of Gagie [5] and arguably
simplifies the proof technique (knowledge about relative entropy is not
required). Evans and Kirkpatrick [4] guaranteed a worst-case drop of
$\log \log n$. Since this does not take into consideration the original depth of
the element in the tree, this could lead to a situation where the depth of the
root in the restructured tree is $\log \log n$ times greater than its depth in
the initial tree.

2.4 Hybrid drop 

The first method presented in Section 2.2 gives the best upper bound on the
worst-case drop, which is $\log_k \log_k n$. The problem is that the restructured
tree produced by this method can have a path length which is $\log_k \log_k n$
times larger than the path length of the original tree. The method introduced in
Section 2.3 avoids this problem by guaranteeing a drop that is proportional to
the depth of the elements in the original tree, but the guarantee on the
worst-case drop is a bit worse than that of the previous method. Here we present
a hybrid method for restructuring a k-ary search tree that guarantees
simultaneously the best upper bounds in terms of relative and worst-case drop
plus a small constant.

Let d' be the value that satisfies
$(d' + 1) \log^{1+\epsilon}(d' + 2) = \log_k n$ with $1 < \epsilon < 2$.
The weight of an internal node $x_i$ is defined as follows




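$$w(x_i) = \begin{cases}
\dfrac{1}{D(x_i)\,(d_T(x_i) + 1) \log^{1+\epsilon}(d_T(x_i) + 2)} & \text{for } (d_T(x_i) + 1) < d', \\[2ex]
\dfrac{1}{D(x_i) \log_k n} & \text{for } d' < (d_T(x_i) + 1) < \log_k n, \\[2ex]
\dfrac{1 + 2\epsilon}{\epsilon n (k-1)} & \text{for } (d_T(x_i) + 1) > \log_k n.
\end{cases}$$

The total weight is

$$W = \sum_{i=1}^{n} w(x_i) \le \sum_{j=0}^{d'} \frac{1}{(j+1) \log^{1+\epsilon}(j+2)} + \sum_{j=d'+1}^{\log_k n} \frac{1}{\log_k n} + \frac{n(1 + 2\epsilon)}{\epsilon n (k-1)}$$

$$\le \frac{1}{\epsilon} + 2 + \frac{1 + 2\epsilon}{\epsilon (k-1)} = \frac{k(1 + 2\epsilon)}{(k-1)\epsilon}.$$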



Those weights are used to build the restructured tree R with the technique
described in Section 2.1. The access probability of an internal node $x_i$ is
given by $p_i = w(x_i)/W$ whereas the access probability of a leaf is null,
i.e., $q_i = 0$ for all leaves.
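
The threshold d' has no closed form, but since $(d + 1)\log^{1+\epsilon}(d + 2)$ is increasing in d it can be located numerically. The sketch below does this by bisection, taking the unsubscripted logarithm to be base 2; that base, the helper names `g` and `solve_d_prime`, and the sample parameters are assumptions of this illustration.

```python
import math


def g(d, eps):
    """Left-hand side (d + 1) * log2(d + 2)**(1 + eps)."""
    return (d + 1) * math.log2(d + 2) ** (1 + eps)


def solve_d_prime(n, k, eps, iterations=200):
    """Bisection for d' >= 0 with g(d') = log_k n; returns 0.0 if log_k n <= 1."""
    target = math.log(n, k)
    if target <= g(0.0, eps):
        return 0.0
    lo, hi = 0.0, target              # g(target) >= target + 1 > target
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid, eps) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


d_prime = solve_d_prime(n=2 ** 20, k=2, eps=1.5)
print(d_prime, g(d_prime, 1.5))
```

Even for a million keys the threshold stays between 2 and 3 under these assumptions, so only the topmost levels of T fall into the first case of the weight definition.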

Theorem 6 A k-ary search tree T can be restructured into a tree R such that
the height of R is at most $\log_k n + 1$ and the drop of a key $x_i$ is at most

$$\min\{\log_k \log_k n,\ \log_k(d_T(x_i) + 1) + (1 + \epsilon) \log_k \log(d_T(x_i) + 2)\} + \log_k \frac{1 + 2\epsilon}{\epsilon} + 1.$$



Proof. By Lemma 1, the depth of an internal node $x_i$ is at most
$\lfloor \log_k \frac{1}{p_i} \rfloor = \lfloor \log_k \frac{W}{w(x_i)} \rfloor$.
The largest depth reached by an internal node is

$$\max_i \log_k \frac{W}{w(x_i)} \le \log_k \frac{\frac{k(1 + 2\epsilon)}{(k-1)\epsilon}}{\frac{1 + 2\epsilon}{\epsilon n (k-1)}} = \log_k n + 1.$$

As a consequence the largest depth of a leaf is at most $\log_k n + 1$, which
corresponds to the maximum height of the restructured tree. Using the same
type of argument as in the proof of Theorem 4, an internal node $x_i$ with
$(d_T(x_i) + 1) < d'$ realizes a drop of at most

$$\log_k(d_T(x_i) + 1) + (1 + \epsilon) \log_k \log(d_T(x_i) + 2) + \log_k \frac{1 + 2\epsilon}{\epsilon} + 1,$$

which is at most $\log_k \log_k n + \log_k \frac{1 + 2\epsilon}{\epsilon} + 1$ by
the definition of d'. An internal node $x_i$ with
$d' < (d_T(x_i) + 1) < \log_k n$ realizes a drop of at most

$$\log_k \log_k n + \log_k \frac{1 + 2\epsilon}{\epsilon} + 1,$$

which is at most
$\log_k(d_T(x_i) + 1) + (1 + \epsilon) \log_k \log(d_T(x_i) + 2) + \log_k \frac{1 + 2\epsilon}{\epsilon} + 1$
by the definition of d'.

□



3 Applications 



The results of Section 2.3 on the relative drop have a natural application in
the context of building optimal height-restricted multiway search trees. We are
interested in measuring the maximum increase of the path length imposed by a
height restriction. We investigate the difference between the path length of the
optimal multiway tree and that of the optimal multiway tree with a height
restriction. We give the best upper bound on the path length of an optimal
multiway tree with a height restriction. Note that to prove the bound we assume
that the access probabilities of the nodes and leaves are given.

Theorem 7 Consider $T^*$, the optimal multiway tree built over the set of keys,
and let $T^*_h$ denote the optimal multiway tree built on the same set of
keys and such that its height is no more than $h > \lfloor \log_k n \rfloor + 1$.
The following is always satisfied:

$$P(T^*_h) \le P(T^*) + f(\max\{1, P(T^*) - h + m\}),$$

where $f(y) = \log_k y + (1 + \epsilon) \log_k \log(y + 1) + \log_k \frac{1+\epsilon}{\epsilon} + 1$
and $m = \lfloor \log_k n \rfloor + 1$.

Proof. Using the method described in Section 2.3 we can restructure $T^*$ into
the tree $R_h$ which has a maximum height h. By definition we have
$P(T^*_h) \le P(R_h)$. Using Theorem 5 and by Jensen's inequality [10] we show

$$P(R_h) = \sum_{i=1}^{n} p_i (d_{R_h}(x_i) + 1) + \sum_{i=0}^{n} q_i \, d_{R_h}(x_{i-1}, x_i)$$

$$\le \sum_{i=1}^{n} p_i \left( d_{T^*}(x_i) + 1 + f(\max\{1, d_{T^*}(x_i) + 1 - h + m\}) \right)$$

$$\quad + \sum_{i=0}^{n} q_i \left( d_{T^*}(x_{i-1}, x_i) + f(\max\{1, d_{T^*}(x_{i-1}, x_i) - h + m\}) \right)$$

$$\le P(T^*) + f(\max\{1, P(T^*) - (h - m)\}).$$

□

Among other things this theorem states that a height-restricted optimal
multiway tree has a path length that differs from the optimal path length
$P(T^*)$ without the height restriction by roughly $2 \log_k P(T^*)$ (even if the
height restriction is nearly maximum, i.e., $\log n + 1$). This casts doubt on
the necessity of using unbalanced search trees.
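
The sketch below evaluates the right-hand side $P(T^*) + f(\max\{1, P(T^*) - h + m\})$ for several height limits h, given a value of $P(T^*)$ supplied by the caller. Taking the unsubscripted logarithm as base 2, choosing $\epsilon = 1.5$, and the sample values n = 1000 and $P(T^*) = 5$ are all assumptions of this illustration; the helper names are hypothetical.

```python
import math


def f(y, k, eps):
    """f(y) = log_k y + (1+eps) log_k log2(y+1) + log_k((1+eps)/eps) + 1."""
    return (math.log(y, k)
            + (1 + eps) * math.log(math.log2(y + 1), k)
            + math.log((1 + eps) / eps, k)
            + 1)


def floor_log(n, k):
    """Largest integer e with k**e <= n, computed without floating point."""
    e = 0
    while k ** (e + 1) <= n:
        e += 1
    return e


def restricted_bound(opt_path_length, n, k, h, eps):
    """Upper bound on the path length of the optimal tree of height <= h."""
    m = floor_log(n, k) + 1
    if h < m:
        raise ValueError("heights below m are not covered by the bound")
    return opt_path_length + f(max(1, opt_path_length - h + m), k, eps)


b10, b12, b14 = (restricted_bound(5.0, 1000, 2, h, 1.5) for h in (10, 12, 14))
print(b10, b12, b14)
```

As the height limit is relaxed the additive term shrinks until its argument reaches 1, after which the bound no longer depends on h.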

Theorem 8 There exists a linear running time algorithm which builds a
multiway search tree $R_h$ with a height smaller than
$h > \lfloor \log_k n \rfloor + 1$ and such that

$$P(R_h) \le UB(k) + f(\max\{1, UB(k) - h + m\}),$$

where UB(k) is defined in Theorem 1.

Proof. We use the technique described in Section 2.1 to build a near-optimal
multiway search tree T in $O(n)$ time. This guarantees that $P(T) \le UB(k)$.
Then we restructure T into $R_h$ in $O(n)$ time using the technique developed in
Section 2.3. We can deduce from Theorem 5 the correctness of the theorem. □